ABSTRACT

A 0.2-12 keV spectrum obtained with the XMM-Newton EPIC/pn instrument
of GRB 011211, taken in the first 5 ksec of a 27 ksec observation, was found
by Reeves et al. (2002; R02) to contain emission lines which were interpreted
to be from Mg XI, Si XIV, S XVI, Ar XVIII, and Ca XX, at a lower redshift
(z_obs = 1.88) than the host galaxy (z_host = 2.14). We examine the spectrum
independently, and find that the claimed lines would not be discovered in a
blind search. Specifically, Monte Carlo simulations show that the significance
of each reported feature, taken individually, is such that it would be observed in
10% of featureless spectra with the same signal-to-noise. Imposing a model in
which the two brightest lines are Si XIV and S XVI Kα emission, velocity
shifted to between z = 1.88-2.40, such features would be found in
~1.3-1.7% of observed featureless spectra (that is, with 98.3-98.7% confidence).
When we account for the number of trials implicit in a search of five energy
spectra and of a wider redshift range (z = 2.14±1.0), the detection confidence of the two line complex decreases to
77-82%. We find the detection significances to be insufficient to justify the claim 



of detection and the model put forth to explain them. Kα line complexes are
also found at z = 1.2 and z = 2.75 of significance equal to or greater than that
at z = 1.88. Thus, if one adopts the z = 1.88 complex as significant, one must
also adopt the other two complexes to be significant. The interpretation of these 
data in the context of the model proposed by R02 is therefore degenerate, and 
cannot be resolved by these data alone. Our conclusions are in conflict with 
those of R02, because our statistical significances account for the multiple trials 
required - but not accounted for by R02 - in a blind search for emission features 
across a range of energies. In addition, we describe a practical challenge to the 
reliability of Monte Carlo Δχ² tests, as employed by R02.

Subject headings: gamma rays: bursts - gamma rays: observations

1. Introduction 

It was recently reported that the X-ray afterglow of gamma-ray burst GRB 011211, 
as observed with XMM-Newton EPIC/pn, contained spectral emission lines (Reeves 
et al. 2002a, hereafter, R02) - the first report of multiple X-ray emission lines from a GRB. 
These lines, at 0.45, 0.70, 0.89, 1.21, and 1.44 keV were interpreted to be from He-like 
Mg XI (rest energy 1.35 keV) and H-like Si XIV(2.0 keV), S XVI (2.62 keV), Ar XVIII 
(3.32 keV) and Ca XX (4.10 keV), redshifted to z = 1.88. The difference between this and
the known redshift of the host galaxy, z_host = 2.14, was modeled as due to supernova ejecta
traveling at v = 25800±1200 km s^-1, which had originated during a supernova 4 days prior
to when the GRB jet illuminated it, producing the afterglow (a more detailed analysis by
the same authors was completed after this paper was in its initial form; Reeves et al. 2002b,
R02b hereafter).
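
As a rough consistency check on the quoted outflow velocity, the following minimal sketch converts the two redshifts into a velocity. It assumes a purely line-of-sight relativistic Doppler shift between the host frame and the emitting material; this is our simplifying assumption for illustration, and R02's own derivation may differ. The function name is hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s


def outflow_velocity(z_host, z_obs):
    """Velocity (km/s) of material seen at z_obs, measured in the frame of z_host.

    Assumes a purely radial relativistic Doppler shift (illustrative assumption).
    """
    ratio = (1.0 + z_host) / (1.0 + z_obs)      # Doppler factor between the two frames
    beta = (ratio ** 2 - 1.0) / (ratio ** 2 + 1.0)
    return beta * C_KM_S


v = outflow_velocity(2.14, 1.88)
print(f"v = {v:.0f} km/s")
```

Under this assumption the result falls within the quoted range of 25800±1200 km s^-1.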

The statistical significance of the individual lines was not reported in R02; it was stated 
that joint analysis of the lines taken together produced an improvement in the χ² value
which, by an F-test, yielded a significance level of 99.7%. In addition, it was found that
Monte Carlo (MC) simulations were unable to produce the same improvement in χ²
found between the best-fit power-law model and the best-fit five emission-line model more
than 0.02% of the time. Specifically, it was found that the best-fit χ² value for a power-law
model was improved by fitting to a model of a MEKAL plasma with emission lines at rest 
energies corresponding to unresolved Mg XI, Si XIV, S XVI, Ar XVIII, and Ca XX redshifted 
to z = 1.88 in only 0.02% of the simulated spectra (a 99.98% confidence detection). 

The implications of the model discussed by R02 - a delay between a supernova 
and a GRB on a timescale of days, the formation of a thin shell of supernova ejecta, 
an apparent under-abundance of Fe relative to the detected nuclei - provide severe 
constraints on gamma-ray burst emission models. In addition, as demonstrated by R02, 
the future detection of multiple emission lines can provide extremely strong constraints on 
the production mechanisms, due to the inherent required outflow velocity and emission 
timescales which can be derived from them, not to mention the implied association with 
supernovae. Similar spectra observed with greater S/N in the future would greatly aid in 
unravelling the emission mechanisms and geometry of gamma-ray bursts. Therefore, it is 
of wide theoretical (e.g. Lazzati et al. 2002; Kumar & Narayan 2002) and observational 
interest to further interpret the observed X-ray spectrum of GRB 011211, in hopes of
determining what more could be learned from future, more precise observations. 

In Sec. 2, we describe the observation, and perform a basic spectral analysis using 
continuum models. In Sec. 3, we compare Monte Carlo (MC) realizations of acceptable 
continuum models with the GRB spectrum, and find that the reported features would 
be produced in ~10% of the continuum model spectra, due only to Poisson noise. In 
Sec. 4, we adopt the model that the two apparently most significant lines are Kα lines of
Si XIV and S XVI; we perform a blind search for features of the same significance in MC
realizations of continuum spectra, and find that they would be reported from ~1.2-2.6% of
such spectra, again, due only to Poisson noise. We describe in Sec. 5 a practical challenge
to the reliability of the MC Δχ² analysis produced by R02. We conclude in Sec. 6 that
the lines are not individually significant in the absence of an imposed model, and are only 
marginally significant when the adopted model is imposed. 

These conclusions conflict with those of R02. R02 derived the model (that is, the 
observed line energies, or redshift) from the data, and then applied statistics for detection 
as if the energies were known prior to examining the data (that is, single-trial statistics). 
This is not appropriate when the model line energies are derived directly from the X-ray 
data, and not from an a priori model - one derived without examination of the X-ray data 
(for example: line energies of multiple features with redshifts of the host galaxy). We adopt 
statistics appropriate to a blind-search for these features, across a range of energies or 
redshifts (multi-trial statistics). This accounts for the diminished significance we find for 
the features. We further discuss the reasons for this conflict and conclude in § 6.



2. Observation and Observed Spectrum 

We analyzed the same source and background XMM-Newton/EPIC-pn (Strüder
et al. 2001) spectrum as used by R02 (their Fig. 2), which was kindly made available to us
by the authors in electronic form (J. Reeves, priv. comm.). We used the same response
matrix (epn_ff20_sdY9_thin.rsp). The spectrum used 5000 sec of realtime observation
beginning at 07:14:33 UT on 12 Dec 2001, with a total live time of 4440 sec. The pn
spectrum used counts comprised of patterns 0-4 (singles and doubles), from a circular
region 46" in radius centered on the source, excluding flagged events (for which the keyword
FLAG != 0) and excluding a region near the edge of the CCD chip.

We performed a basic spectral analysis using XSPEC v11.1.0 (Arnaud 1996). We
used data in the 0.2-12 keV energy range. We performed a non-standard spectral binning,
implemented to maximize the signal-to-noise associated with the reported emission features
at the reported energies. We first binned data with energy bins centered at the five best-fit
line energies found by R02, with bin sizes approximately equal to the FWHM of the EPIC/pn
energy response at each energy (respectively: 62 eV, 66 eV, 68 eV, 72 eV and 75 eV; see
Eq. 1). The remaining data were binned with 60 eV or greater (<1 keV), and 70 eV or
greater (>1 keV). Between 0.2 and 5 keV, each bin has >15 counts (although they were not
binned on this basis), for which χ² fitting is valid.

We fit an absorbed photon power-law spectrum to the data, shown in Fig. 1. The 
model spectrum was statistically acceptable (model parameters are listed in Table 1, 
along with the obtained χ² values). We also fit the data with a thermal bremsstrahlung
spectrum (wabs*bremss), and derived an acceptable best fit. Finally, we found best-fit
model parameters with the power-law photon slope (models 2 and 3) and
kT_bremss (models 5 and 6) set at the 90% confidence limits of the best fit; these will be used in
MC simulations in § 3.

3. Individual Emission Line Significances 

We first determine which of the reported lines are individually statistically significant, 
when one is searching for emission lines at a priori known energies. We compared the 
observed spectrum with MC simulations of six featureless spectra - the three absorbed 
power-law and three absorbed thermal bremsstrahlung which are models 1-6 in Table 1. 

We used a "matched filter", convolving the observed pulse-invariant (PI) counts
spectrum with a Gaussian whose width matches the energy resolution of the
detector. The matched filter approach maximizes the signal-to-noise ratio as a function of
energy of unresolved lines in the X-ray PI spectrum.

Based on Figure 18 in the XMM-Newton v1.1 Users' Handbook (Dahlem 1999), we
modelled the photon energy redistribution as a Gaussian response, with FWHM:

$$\mathrm{FWHM}(E) = 57 + 13\,(E/1\,\mathrm{keV}) - 0.29\,(E/1\,\mathrm{keV})^2\ \mathrm{eV} \qquad (1)$$

This approximation was derived from the line in this figure. The EPIC/pn energy 
resolution has been demonstrated to be stable over 9 months of in-flight calibration (Strüder
et al. 2001). We expect that this analysis (and that of R02, since that work is based on the
same energy response matrices) is valid as long as the energy resolution is within 20% of 
this approximation (corresponding to 3 of ~15 PI channels at 0.75 keV). 
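
The resolution model of Eq. 1 is easy to evaluate directly. The short sketch below (function names are ours, chosen for illustration) tabulates the FWHM and the corresponding Gaussian width, using the FWHM/2.35 conversion adopted in this section, at the five observed line energies reported by R02.

```python
def fwhm_ev(e_kev):
    """Gaussian FWHM (eV) of the EPIC/pn response at energy e_kev (keV), per Eq. 1."""
    return 57.0 + 13.0 * e_kev - 0.29 * e_kev ** 2


def sigma_ev(e_kev):
    """Gaussian sigma (eV), using sigma = FWHM / 2.35."""
    return fwhm_ev(e_kev) / 2.35


# Observed line energies (keV) reported by R02
line_energies_kev = [0.45, 0.70, 0.89, 1.21, 1.44]
for e in line_energies_kev:
    print(f"E = {e:.2f} keV  FWHM = {fwhm_ev(e):.1f} eV  sigma = {sigma_ev(e):.1f} eV")
```

The values agree to within about 1 eV with the bin sizes of 62, 66, 68, 72 and 75 eV quoted in § 2.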

We performed a convolution between the raw PI spectrum (that is, number of counts
vs. PI bin) and the Gaussian energy response function, as a function of energy:



where N is the number of PI bins, and we sum across PI bins which are within ±3σ(E_i)
of E_i. The raw PI spectrum contains both source and background counts,
and j = 1, 2, ..., N is the PI bin number. The centroid (average) energies and energy widths
(ΔE_j) of the PI bins were taken from the EBOUNDS extension of the response matrix, where
i is the PI bin number and σ(E) = FWHM(E)/2.35. We do not correct the PI spectrum
for the detector area; however the detector area does not change dramatically across the
FWHM of the lines. If the area did change dramatically across the FWHM of a line, and
a statistical excess were observed in the area-corrected PI spectrum but not in the raw PI
spectrum, then such an excess could well be due to calibration uncertainties.
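
The filtering step described above can be sketched as follows. This is an illustration rather than the code used in the analysis: the peak-normalised Gaussian weight and the omission of any bin-width factor are assumptions of the sketch, and the function names are hypothetical.

```python
import math


def fwhm_kev(e_kev):
    """Eq. 1, converted from eV to keV."""
    return (57.0 + 13.0 * e_kev - 0.29 * e_kev ** 2) / 1000.0


def matched_filter(counts, e_cent):
    """Gaussian-weighted sum of raw PI counts around each PI bin centroid.

    counts : raw counts per PI bin (source plus background)
    e_cent : PI bin centroid energies in keV
    The weight normalisation (unity at the bin itself) is an assumption.
    """
    filtered = []
    for e_i in e_cent:
        sigma = fwhm_kev(e_i) / 2.35
        total = 0.0
        for n_j, e_j in zip(counts, e_cent):
            if abs(e_j - e_i) <= 3.0 * sigma:          # only bins within +/- 3 sigma
                total += n_j * math.exp(-0.5 * ((e_j - e_i) / sigma) ** 2)
        filtered.append(total)
    return filtered
```

Applied to a spectrum containing a single count, the filter returns a Gaussian bump of the local instrumental width centred on that count, which is the response expected from an unresolved line.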

The resulting C(Ei) is shown in Fig. 2a. By visual inspection, there are indeed features 
in the spectrum near energies where the reported lines occur. 

To determine if these features are significant, we produced MC spectra of models 
1-6 (see § 2). The MC realizations of the raw PI spectra were performed as follows. We 
simulated the spectral models 1-6 in XSPEC, using the same response matrix as above, 
so that the resulting PI spectra (without Poisson noise added) were convolved as the 
observed spectrum through the telescope and detector response. The simulated PI spectra 
N(E) each had a total of >9×10^8 counts in PI bins between 0.2-3 keV. We then produced
integrated spectra $I(E) = \int_{0.2\,\mathrm{keV}}^{E} N(E')\,dE' \big/ \int_{0.2\,\mathrm{keV}}^{3\,\mathrm{keV}} N(E')\,dE'$, so that I(0.2 keV) = 0 and
I(3 keV) = 1 (the integrated normalized model is used for the MC simulation as described
below). These constitute our six acceptable featureless spectral models; we will compare 
the data with results from all six, as a firm conclusion that emission lines are present should 
be independent of the underlying broad-band model assumed. 

We implemented a background spectral model, to simulate the ~10% of the counts
due to background. Taking background from a different part of the detector, we find that
it can be parameterized by a broken photon power-law (bknpower), with α_1 = 2.4 at low
energies, break energy 1.35 keV, and α_2 = 0.44 at high energies, between 0.20-7.3 keV (there
is a strong background line at 8 keV). In fact, there are statistically significant deviations
from this pure continuum model between 0.55-0.6 keV; we ignore these deviations in our
background model. In our MC simulation the effect of ignoring what would appear to be
a line in the observed spectrum is conservative, in the sense that by ignoring its presence
in the background model, we could detect as "significant" a line in the 0.55-0.6 keV range
which is in fact produced by instrument background.

We simulated spectra between 0.2 and 3 keV, in which there were 560 counts in the
observed spectrum, of which we estimate ~66±1.2 counts are due to background. We drew,
for each MC realization, a number of background counts as a random Poisson deviate
(using poidev, Press et al. 1995) with an average of 66 counts, with the remaining (of 560)
counts from the source. To produce a simulated spectrum, we generate a random uniform
deviate r between 0 and 1, and we place a count in the PI bin in which I(E) = r.
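
The sampling procedure just described can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the function names are ours, Knuth's multiplication method stands in for poidev, and the background is drawn from its own normalised cumulative spectrum in the same way as the source.

```python
import bisect
import math
import random


def cumulative(model_counts):
    """Normalised cumulative spectrum I over PI bins, rising to 1 in the last bin."""
    total = float(sum(model_counts))
    running, cdf = 0.0, []
    for n in model_counts:
        running += n
        cdf.append(running / total)
    cdf[-1] = 1.0                      # guard against floating-point round-off
    return cdf


def poisson_deviate(mean, rng):
    """Poisson deviate by Knuth's method (adequate for means of order 100)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def draw_counts(cdf, n_counts, rng):
    """Place n_counts events into PI bins by inverse-transform sampling."""
    spectrum = [0] * len(cdf)
    for _ in range(n_counts):
        r = rng.random()                               # uniform deviate in [0, 1)
        spectrum[bisect.bisect_right(cdf, r)] += 1     # first bin where I exceeds r
    return spectrum


def realization(src_cdf, bkg_cdf, n_total=560, bkg_mean=66.0, rng=None):
    """One simulated raw PI spectrum with a fixed total number of counts."""
    rng = rng or random.Random()
    n_bkg = min(poisson_deviate(bkg_mean, rng), n_total)
    src = draw_counts(src_cdf, n_total - n_bkg, rng)
    bkg = draw_counts(bkg_cdf, n_bkg, rng)
    return [s + b for s, b in zip(src, bkg)]
```

Fixing the total at the observed 560 counts, with only the background share fluctuating, mirrors the description above; letting the total itself fluctuate would be a different (also defensible) design choice.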

To produce our confidence limits to C(E), we produced 1667 MC realizations each of 
models 1-6 for a total of 10002 realizations. We set the 99% and 99.9% confidence limits 
at the 100th and 10th greatest values, respectively, of C(E) of all such realizations. This 
ensures that the conclusions are not dependent upon the assumed featureless spectral model.
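
A minimal sketch of how such limits can be read off an ensemble of filtered MC spectra is given below. It assumes the limits are taken bin by bin across all realizations (our reading of the procedure), and the function name is hypothetical.

```python
import heapq


def upper_limits(filtered_realizations, ranks=(100, 10)):
    """For each PI bin, return the rank-th greatest filtered value over all realizations.

    filtered_realizations : list of filtered spectra, all of the same length
    ranks                 : e.g. 100 and 10 for the 99% and 99.9% limits of ~10^4 spectra
    """
    n_bins = len(filtered_realizations[0])
    limits = {rank: [] for rank in ranks}
    for i in range(n_bins):
        column = [spec[i] for spec in filtered_realizations]
        top = heapq.nlargest(max(ranks), column)       # sorted in descending order
        for rank in ranks:
            limits[rank].append(top[rank - 1])
    return limits
```

With 10002 realizations, the 100th and 10th greatest values in a bin are exceeded by roughly 1% and 0.1% of the featureless spectra, respectively.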

The results of this MC simulation are shown in Fig. 2a. Two of the reported features 
(near 0.7 keV and 0.85 keV; Si XIV and S XVI) have single-energy-trial probabilities of 
>99% confidence in comparison with the featureless spectral models (the claimed S XVI 
line peaks just below the 99.9% confidence limit; we will treat it as having met 99.9% 
confidence, while the reader may regard this as an upper-limit). The remaining three lines 
are not significant in comparison with single-energy-trial probability of 99% confidence. 

In Fig. 2, we also show C(E) for single MC realizations of the best-fit power-law 
spectrum (model 1), which also contain apparent features. The bumps in the single 
simulated spectra appear because in any spectrum which contains Poisson noise, the counts 
will not be distributed uniformly in energy, but will be clustered in energy simply due to 
counting statistics. 

3.1. Multi-Energy-Trial (Blind Search) Probabilities

Since it was necessary to perform a blind-search for emission line features in the GRB 
spectrum - as the redshifted line energies were not known a priori, but were measured from 
the data - it is necessary to estimate the chance probability that the reported features are 
produced from a featureless spectrum during a blind search for such features. 

We produced 10000 MC realizations for each of models 1-6 as described in the 
previous section. We compared the C(E) of these between 0.4 and 1.5 keV against the 
single-energy-trial 99% and 99.9% confidence limits we found in the previous section, for 
the models 1-6 individually. 

In Table 2 we list the fraction of the 10000 MC spectra in which C(E) in at least
one PI bin reaches a single-energy-trial probability of 99% or 99.9% confidence. These 
fractions are 78-79% and 14-17%, respectively; if finding a single-energy-trial 99% (Si XIV) 
and 99.9% (S XVI) feature were statistically independent, then the probability of observing 
both a 99% and 99.9% single-energy-trial "line" in a single spectrum, such as we find in the 
present spectrum of GRB 011211, is ~10%. 
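
The arithmetic behind this estimate is a simple product under the independence assumption stated above. The sketch below pairs the ends of the quoted ranges to bracket the result; that pairing is our choice for illustration and need not correspond to any single spectral model.

```python
# Fractions of featureless MC spectra showing at least one feature (quoted from Table 2)
p99_range = (0.78, 0.79)     # single-energy-trial 99% confidence
p999_range = (0.14, 0.17)    # single-energy-trial 99.9% confidence

# Joint probability of both features, assuming statistical independence
low = p99_range[0] * p999_range[0]
high = p99_range[1] * p999_range[1]
print(f"joint probability between {low:.2f} and {high:.2f}")
```

The product is of order 0.1, consistent with the roughly one-in-ten chance quoted.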

Therefore, in a blind-search of the EPIC/pn spectrum for emission features, we would
expect to find features which have single-energy-trial significance equal to or greater than those
observed in approximately one of ten observed featureless spectra.



4. Line Complex Significance As a Function of Redshift 

In this section, we examine if the reported lines, taken together, implicate Kα emission
features from the particular redshift of z = 1.88 as reported by R02. We do so by summing
the C(E_j/(1 + z)), using the values of the rest energies of the reported lines:



(4) 

where j denotes the PI bin number, E_j is the centroid energy of the jth PI bin, i denotes
the index [1,5] of the five lines reported detected; E_i denotes the rest energies of the
five lines identified by R02, which were 1.35 (Mg XI), 2.00 (Si XIV), 2.62 (S XVI), 3.32
(Ar XVIII), and 4.10 keV (Ca XX). We examined the range of 0 < z < 3, with a step-size of
Δz = 0.015. We use PI bins with energies 0.1-7 keV, to cover the spectrum past the rest
frame energy of Ca XX. We find 638 counts in this energy range, of which we estimate 80
are from background. We use only bins which are within 3σ(E_i/(1 + z)) of each E_i/(1 + z).
The result of this convolution, if the reported lines are real, should be a maximum in x(z)
near the optimal redshift value, in excess of that found from MC realizations of data with
featureless spectra.
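
A redshift scan of this kind can be sketched as below. This is an illustration under assumptions, not a reproduction of Eq. 4: the statistic is taken to be the sum over lines of a peak-normalised Gaussian-weighted count total at each shifted line energy, and all function names are hypothetical.

```python
import math

# Rest energies (keV) of the five lines identified by R02
REST_KEV = [1.35, 2.00, 2.62, 3.32, 4.10]


def fwhm_kev(e_kev):
    """Eq. 1, converted from eV to keV."""
    return (57.0 + 13.0 * e_kev - 0.29 * e_kev ** 2) / 1000.0


def line_sum(counts, e_cent, z, rest_energies):
    """Sum of Gaussian-weighted counts at each rest energy shifted to E / (1 + z)."""
    total = 0.0
    for e_rest in rest_energies:
        e_obs = e_rest / (1.0 + z)
        sigma = fwhm_kev(e_obs) / 2.35
        for n_j, e_j in zip(counts, e_cent):
            if abs(e_j - e_obs) <= 3.0 * sigma:        # bins within 3 sigma of the line
                total += n_j * math.exp(-0.5 * ((e_j - e_obs) / sigma) ** 2)
    return total


def redshift_scan(counts, e_cent, rest_energies, z_max=3.0, dz=0.015):
    """Return a list of (z, statistic) pairs on a uniform redshift grid from 0 to z_max."""
    n_steps = int(round(z_max / dz))
    return [(k * dz, line_sum(counts, e_cent, k * dz, rest_energies))
            for k in range(n_steps + 1)]
```

Passing `REST_KEV` scans for the five-line complex, while passing `[2.00, 2.62]` restricts the scan to the Si XIV and S XVI pair.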

The average value of x(z) will systematically increase with increasing z as the lines are
shifted to lower energies, where the intensity is higher in the power-law spectrum and the
detector effective area is larger and so there are a greater number of counts. To examine
if any particular maximum in x(z) is significant, we performed this convolution for 10000
MC realizations using the simulated spectral models 1-6, taking the 100th and 10th highest
values, as described in the previous section, to produce the 99% and 99.9% confidence limits
respectively.

The results of the calculation using all 5 reported lines, as well as the 99% and 99.9% 
MC confidence limits, are in Fig. 3. The value of x(z) is in excess of the 99% MC confidence
limit at z = [1.86 - 1.98] and z = [2.62 - 2.865], and in excess of the 99.9% MC confidence
limit at z = 2.76.

We also performed this convolution and Monte Carlo simulation using what appear to
be the most significant two lines from Fig. 2 of R02 (Si XIV and S XVI), the results of which
are also shown in Fig. 3. The value of x(z) is in excess of the 99% MC confidence limit at
z = [1.155 - 1.275] and z = [1.80 - 2.01], and in excess of the 99.9% confidence limit at
z = [1.86 - 1.95].

4.1. Multi-Redshift-Trial (Blind Search) Probabilities 

What fraction of featureless spectra, with the same number of source and background
counts as the observed spectrum, would produce values of x(z) of comparable significance
to the excess in x(z = 1.88) from the observed spectrum? If one examines x(z) only at
z = 1.88, the answer is <1%, which is the single-z-trial probability. However, the reported
z = 1.88 is different from the known redshift of the host galaxy z = 2.14; it is therefore
unlikely that z = 1.88 was the only redshift which would be considered consistent with an a
priori model by R02. The pertinent statistical question to ask, then, is what fraction
of featureless X-ray spectra, examined for a redshifted pair of S XVI and Si XIV lines, would
produce a value of x(z) comparable to the single-z-trial significance observed, allowing for a
blind-search at values of z between 1.88 and 2.40 (a range of equal magnitude redshift and
blueshift from the host galaxy)?

To address this, we simulated 10000 MC spectra of each of spectral models 1-6, and
found x(z) in the same way as for the observed spectrum in the previous section. We used
only the 2-line model, as this gave the apparently most significant result near z = 1.88.
We compared x(z) with the 99% and 99.9% MC limits, found in the previous section, and
noted when these were exceeded in at least one z bin for the 99% confidence limit, and
in at least seven consecutive z bins for the 99.9% confidence limit between z = 1.88 and
z = 2.40. We require seven consecutive z bins as this is the number of z bins in x(z) we find
in excess of the single-trial 99.9% confidence limit near z = 1.88. (We require only 1 bin for
the 99% confidence limit to satisfy a minimal "detection" requirement; whereas we require
seven bins for the 99.9% confidence limit, since this was what was actually observed near
z = 1.88, and we wish to evaluate the likelihood of producing the observed x(z) excess.)

The fraction of MC featureless spectra which contained at least 1 z bin between
z = 1.88 and z = 2.40 in excess of the single-z-trial MC probability of 99% is given in
Table 3. Because the observed spectrum gave x(z) > 99.9% in seven consecutive z bins,
we also used this as our criterion to count "hits" in the >99.9% confidence comparison,
also shown in Table 3. Of 10000 MC spectra, between 20-22% produced "hits" for the
single-z-trial 99% confidence limit, and 1.5-1.9% produced "hits" for the single-z-trial
99.9% confidence limit.
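
The "hit" criterion can be stated compactly in code. The sketch below is illustrative only (function names and default arguments are ours): it counts a simulated scan as a hit when it exceeds the single-trial limit in a run of consecutive z bins inside the searched window, using a run of seven for the 99.9% limit and a run of one for the 99% limit.

```python
def longest_run_above(values, limits):
    """Length of the longest run of consecutive bins in which values exceed limits."""
    best = run = 0
    for v, lim in zip(values, limits):
        run = run + 1 if v > lim else 0
        best = max(best, run)
    return best


def is_hit(x_of_z, limit_of_z, z_grid, z_lo=1.88, z_hi=2.40, min_run=7):
    """True if the scan exceeds the limit in >= min_run consecutive bins within [z_lo, z_hi]."""
    inside = [(x, lim) for x, lim, z in zip(x_of_z, limit_of_z, z_grid)
              if z_lo <= z <= z_hi]
    if not inside:
        return False
    xs, lims = zip(*inside)
    return longest_run_above(xs, lims) >= min_run


def false_positive_fraction(mc_scans, limit_of_z, z_grid, **kwargs):
    """Fraction of featureless MC scans that count as hits."""
    hits = sum(is_hit(scan, limit_of_z, z_grid, **kwargs) for scan in mc_scans)
    return hits / len(mc_scans)
```

Widening the window, for example to `z_lo=1.14, z_hi=3.14`, raises the number of effective trials and hence the false positive fraction.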

We note that when we search the range z = 2.14±1.0 (instead of ±0.26) the percentage
of featureless spectra which have seven consecutive z bins with x(z) greater than the 99.9%
limit is 3.8-5.0%. However, it is unclear if R02 would have attached equal significance to a
detection at z = 1.14 as one at z = 1.88, as no limits on excess line emission as a function
of assumed redshift are given, and the redshift phase-space examined by R02 was not given.
We therefore rely on our search of the smaller phase-space; while this may underestimate
the number of "effective trials" used by R02, it nonetheless serves as the probability of
producing the claimed excess line emission due to a statistical fluctuation within the δz = 0.26
observed. If the redshift space examined by R02 were 1.14-3.14 (δz = 1.0), then the
probability of finding an excess equal to or greater than that observed would be 3.8-5.0%. If
the full redshift space of 0-5 was in fact examined by R02 then the probability of a false
detection is >3.8-5.0%.

5. A Practical Challenge with Monte Carlo Δχ² Tests for Multi-Parameter Models

A MC Δχ² test as employed by R02 is not fundamentally flawed in the way that the analytic Δχ²
test (that is, the F-test) is for the application of spectral emission line discovery. In the F-test,
the reference Δχ² distribution is derived under the assumption that the null hypothesis
does not lie on the boundary of the allowed parameter space (Protassov et al. 2002), which is not
true in a search for emission lines, where the null value of the line flux (zero) lies on that
boundary; however, this assumption is not made in the MC Δχ²
test. Thus, the simulated Δχ² distribution can, in principle, provide a reliable reference
distribution with which the value of Δχ² from application to real data can be compared to
determine the false positive rate.

However, as we show below, the MC Δχ² test as employed by R02 (and described
more fully by R02b) suffers from a practical problem which makes it an inferior approach
to the one we have applied. Specifically, to apply the Δχ² statistic using the MC approach,
one must assuredly find the global minimum χ² for the applied model for every single MC
realization; the description of the analysis performed by R02 (and R02b) does not assure
that this has occurred.

In the case of χ² minimization through local mapping of the χ² surface, as in the
modified Levenberg-Marquardt method employed in XSPEC (Arnaud 1996; modified from
the CURFIT algorithm as described by Bevington 1969; see also Press et al. 1995), one
finds the vector in multi-parameter space along this surface which provides the most
negative derivative, follows along this vector a short way, and iterates, until one reaches a
point where there are no negative derivatives in any direction along the χ² surface (that
is, when one has reached a minimum point). This approach suffers from the well known
problem of local minima, where the true global minimum can lie at a completely different
set of parameter values (see, for example, Press et al. 1995, p. 394). On simple χ² surfaces,
where the second partial derivatives of the χ² surface are everywhere small - certainly in
the case of the two-parameter power-law spectrum - it is rare that local minima different
from the global minimum are found. However, on complex χ² surfaces (those which contain
large second partial derivatives of χ²) - as will be the case when fitting a six-parameter
model of three emission lines of specified rest energy with variable fluxes and redshift
plus a power-law (slope and normalization) - local minima are common; consequently,
this approach is unsuited to the unassisted discovery (by computer alone, without human
intervention) of the global χ² minimum. It is, for example, a common occurrence when
using a multi-component model (of more than, say, three parameters) in XSPEC that some final
human assistance is required to find the global minimum, since almost always it is a local
minimum which is found unassisted by the computer; the quantitative difference in χ²
between the computer-discovered local minimum and the true global minimum will depend
on the complexity of the χ² surface. In general, χ² surfaces become more complex with the
addition of more model parameters, and the discrepancy will be greater when there are
greater covariances between model parameters (such as can be expected between flux for
line 1 vs. line 2, or vs. the continuum, or for each of the lines and the continuum, or for
the relative flux for the lines and the spectral slope; and so on). The only certain means
to overcome this known deficiency is to evaluate χ² on a parameter grid with resolution in
each parameter dimension much smaller than ranges where the χ² value changes by 1.
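
The distinction between a path-dependent local search and an exhaustive grid evaluation can be illustrated with a toy example. The one-parameter surface below is invented purely for illustration and is not a spectral model; simple gradient descent stands in for the local minimiser, and all names are hypothetical.

```python
import itertools


def toy_chi2(x):
    """Toy surface with a local minimum near x = +0.96 and a global minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_descent(f, x0, step=0.01, h=1e-6, n_iter=2000):
    """Follow the negative numerical derivative from x0 until the iterations run out."""
    x = x0
    for _ in range(n_iter):
        grad = (f(x + h) - f(x - h)) / (2.0 * h)
        x -= step * grad
    return x


def grid_minimum(chi2, axes):
    """Evaluate chi2 at every point of the Cartesian grid; return (minimum value, parameters)."""
    return min((chi2(*p), p) for p in itertools.product(*axes))


x_local = local_descent(toy_chi2, x0=1.5)               # starts on the wrong side
axis = [-2.0 + 0.01 * k for k in range(401)]
best_value, (x_grid,) = grid_minimum(toy_chi2, [axis])
print(f"local descent: x = {x_local:.2f}, chi2 = {toy_chi2(x_local):.2f}")
print(f"grid search:   x = {x_grid:.2f}, chi2 = {best_value:.2f}")
```

Started at x = 1.5, the descent settles in the shallower minimum, while the grid locates the deeper one; the cost of the grid grows as the product of the axis lengths, which is why it becomes expensive for a six-parameter model.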

Thus, while the global minimum will likely be found from unassisted discovery for the 
power-law spectral model, it is more likely that only a local minimum will be found for 
the six-parameter model when the search for this minimum is not human-assisted. This 
will underestimate the value of Δχ² for that realization; over the entire ensemble of MC
realizations, there are then fewer false positives, and the significance of the Δχ² in support
of the presence of lines will be overstated. In addition to the problem of local minima,
the local minimum found will be dependent upon initial parameter values (that is, the
algorithm is path-dependent); it therefore does not lend itself to duplication by different 
groups. Also, the spectral fitting is non-analytic, such that the magnitude of a possible 
discrepancy cannot be evaluated a priori. 

XSPEC - which R02 states was used for the MC simulation - does use this local χ²
minimization approach. We suggest that it is unlikely that human-assisted spectral fitting -
as is common practice when attempting to find the global χ² minimum for a single spectrum
in XSPEC, and as we ourselves found necessary for the single spectral fit to the real data -
was performed for all 10,000 MC spectra by R02, as this would be an impractically long
task.

We are unable to attempt to duplicate the result of R02, because performing assisted 
spectral fitting on 10,000 MC spectra is impractical, and in any case, we know of no 
deficiency in our present approach. Our approach, in contrast, is analytic and not 
path-dependent and, therefore, more robust. 

6. Discussion and Conclusions 

We have attempted to confirm the observational statistical significance of emission 
lines in the X-ray afterglow of GRB 011211. In a blind-search for individual emission lines 
between 0.4 and 1.5 keV, features of significance equal to those observed will be found in 
one in ten featureless spectra. Thus, the reported features can be said to be detected with 
90% confidence in a model-independent way. 

Also, a blind-search for the two-line complex (Si XIV and S XVI) at any redshift
between the reported value (z = 1.88) and a blueshift of equal magnitude from the host
galaxy (z = 2.40) would find such features with equal significance to that observed in
1 of 60 featureless spectra (1.3-1.7% of the time, depending on the intrinsic spectrum).
Thus, the features as reported can be said to be detected with 98.3-98.7% confidence, in a
model-dependent interpretation, where we search for two features due to Si XIV and S XVI
Kα redshifted to some value of z in the range z = 2.14±0.26.

The difference between the present statistics and those of R02 is due to the different
statistical arguments used to establish the existence of the emission lines. While R02 relies
on single-trial statistics, we find the model used by R02 (Kα lines, at a redshift different
from that of the host galaxy) was derived directly from the data, which therefore requires a
statistical analysis appropriate to a blind search. By expanding the searched phase-space,
and taking into account the multiple trials of a blind search, the confidence in the detection
drops from the 99.98% of R02 to the 98.7% (best case) we find here. Moreover, R02 did
not estimate the individual significances of the lines as we do here; we find that such
"lines" would appear in between 15-78% of observed featureless spectra for a single-trial
significance comparable to that of the reported Si XIV or S XVI lines.

The analysis of these data has otherwise recently been called into question. Borozdin 
& Trudolyubov (2002) have shown that there is a background line associated with the 
EPIC/pn detector edge during the observation, which would have been included in the 
GRB spectrum from the first 5 ksec, when the source was near the detector edge, but 
not afterwards, after the source had been moved away from the detector edge. In our 
own analysis, we cannot confirm this result unless we adopt non-standard event selection 
criteria, which differ from the ones used by R02. R02 removed events near the CCD
detector edge (FLAG==0) and selected only single and double events (PATTERN<=4) (J.
Reeves, priv. comm.). These selections result in a smooth, featureless background spectrum
with no bright line-like feature near ~0.7 keV (as seen in Fig. 4f of Borozdin &
Trudolyubov 2002) as well as a reduction of the count rate by a factor of ~2 in the range
E = 0.2-3 keV (see Fig. 4). Therefore, we are not able to confirm the applicability of
Borozdin & Trudolyubov (2002) to the analysis of R02.

An alternative approach to the one we have taken is employed using XSPEC, in which
one fits a featureless spectrum to the data, and then a spectrum which includes emission
lines, to determine if the change in χ² is significant, according to an F-test; this is the
approach taken by R02. However, this approach for the detection of emission or absorption
lines is formally incorrect, and gives false statistical results (Protassov et al. 2002),
particularly so when the true continuum is not well constrained, as in the present case. We
therefore prefer our approach of applying a matched energy response filter for line detection 
at arbitrary energies, and to compare this with application of the matched filter to MC 
realizations of featureless spectra. It is a trivial statistical exercise to demonstrate that 
matched filtering maximizes the signal-to-noise ratio (and thus detectability) for detection 
of infinitely narrow emission lines. 
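
The comparison between a matched-filtered spectrum and matched-filtered MC realizations can be sketched as follows. This is only an illustrative sketch, not the analysis code used for the results quoted here: the Gaussian kernel standing in for the EPIC/pn energy response, the use of a single fixed kernel width, and the names matched_filter, blind_search_fraction and gaussian_kernel are all assumptions made for the example.

```python
import numpy as np

def matched_filter(spectrum, kernel):
    """Cross-correlate a binned spectrum with a line-response kernel."""
    return np.convolve(spectrum, kernel[::-1], mode="same")

def gaussian_kernel(sigma_bins, half_width):
    """Unit-area Gaussian kernel (an assumed stand-in for the detector response)."""
    x = np.arange(-half_width, half_width + 1)
    k = np.exp(-0.5 * (x / sigma_bins) ** 2)
    return k / k.sum()

def blind_search_fraction(model, kernel, n_mc=2000, confidence=0.99, seed=0):
    """Fraction of featureless Poisson spectra with >= 1 single-trial 'line'."""
    rng = np.random.default_rng(seed)
    # Featureless MC realizations: Poisson noise about the continuum model.
    sims = rng.poisson(model, size=(n_mc, model.size))
    # Matched-filter the residuals of every realization.
    filtered = np.array([matched_filter(s - model, kernel) for s in sims])
    # Single-trial threshold in each energy bin, taken from the MC distribution.
    threshold = np.quantile(filtered, confidence, axis=0)
    # A realization is a false positive if ANY bin exceeds its own threshold.
    return float(np.mean((filtered > threshold).any(axis=1)))
```

Calling blind_search_fraction(np.full(100, 50.0), gaussian_kernel(2.0, 6)) on a hypothetical flat continuum of 50 counts per bin should return a fraction well above the nominal 1%, because a blind search over many energy bins contains many effectively independent trials.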

In estimating the model-dependent confidence limit for the detection of the line complex
(98.7%), we accounted only for searching the redshift phase space between z = 1.88 and
z = 2.40, symmetric about the host galaxy redshift, which is an extremely minimal requirement.
We did not account for the full redshift phase space searched by R02, as this was not
given in that reference; if the redshift phase space searched by R02 covered z = 1.14-3.14
(z = 2.14±1.0), then the detection significance of the two strongest lines (Si XIV and S XVI)
together decreases from 98.3-98.7% to 95-96.2% confidence. Finally, we did not include
in this confidence limit the number of trials implicit in searching five X-ray spectra for
emission lines, which was performed by R02 for different time periods (0-5 ksec, 5-10 ksec,
10-15 ksec, 15-20 ksec, and 20-27 ksec). If we presume the same search was made on all
five spectra, as seems a reasonable a priori search to perform, then the detection confidence
for the Si XIV and S XVI lines together decreases to 0.95^5-0.962^5 = 77-82%. We regard
98.7% to be a conservative (in the sense of permitting a higher significance) upper limit
on the confidence of detecting the Si XIV and S XVI lines together, while a more accurate
accounting of the number of trials and phase space searched by R02 produces a 77-82%
confidence limit.
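
The arithmetic behind the 77-82% figure can be checked with a few lines of Python. The sketch assumes, as the text does, that the five spectra constitute independent searches, so that the probability of no false positive in any of them is the single-search confidence raised to the fifth power; the function name confidence_after_trials is hypothetical.

```python
def confidence_after_trials(single_search_confidence, n_trials):
    """Confidence remaining after n independent searches of featureless data."""
    return single_search_confidence ** n_trials

# Single-spectrum confidence range for the wide (z = 2.14 +/- 1.0) search.
low = confidence_after_trials(0.95, 5)
high = confidence_after_trials(0.962, 5)
print(f"{100 * low:.0f}-{100 * high:.0f}%")  # prints 77-82%
```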

We consider neither a 90% confidence detection in a model-independent interpretation, 
nor a 98.3-98.7% confidence detection in a model-dependent interpretation, to be sufficient 
to justify the detection claims and subsequent interpretation put forth by R02. The 77-82% 
confidence limit, which accounts for the wide z-phase space and number of spectra examined 
by R02, is well below any comfortable detection confidence. If the z phase space actually 
searched by R02 is larger, the number of implicit trials is greater, and our estimate of the 
confidence level for the detected line complex would decrease. 

Moreover, if one concludes that the marginal detection of the 2 lines (Si & S) near 
z = 1.88 is significant, then one must also conclude that the detection of all 5 lines near 
z = 2.75 is equally significant. In addition, if one concludes that the marginal detection of 
the 5 lines near z = 1.88 is significant, then one must also conclude that the detection of 2 
lines (Si & S) near z = 1.2 is equally significant.

Therefore, one cannot conclude simply that a complex of K$\alpha$ line emission is detected
near z = 1.88; these data permit alternate interpretations of such complexes near z = 1.2
and z = 2.75. As the statistical excesses are due to the same "features" in the observed 
spectrum, the interpretation of the statistical excess in the context of the model presented 
by R02 is degenerate and cannot be resolved with these data alone. 
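
One way to see how the same spectral features can serve more than one redshift is to compute where each line would fall in the observed frame. The sketch below is illustrative only: the rest-frame energies in REST_KEV are approximate values assumed for the example (they are not quoted in this section), and the function name observed_energies is hypothetical.

```python
# Approximate rest-frame line energies in keV (assumed values for illustration).
REST_KEV = {
    "Mg XI": 1.35,
    "Si XIV": 2.01,
    "S XVI": 2.62,
    "Ar XVIII": 3.32,
    "Ca XX": 4.11,
}

def observed_energies(z, lines=REST_KEV):
    """Observed-frame energy of each line at redshift z: E_obs = E_rest / (1 + z)."""
    return {name: e_rest / (1.0 + z) for name, e_rest in lines.items()}

for z in (1.2, 1.88, 2.75):
    row = ", ".join(f"{name} {e:.2f}" for name, e in observed_energies(z).items())
    print(f"z = {z}: {row}")
```

With these assumed energies, Si XIV at z = 1.88 and S XVI at z = 2.75 both land near 0.70 keV, while S XVI at z = 1.88 and Si XIV at z = 1.2 both land near 0.91 keV, so one pair of excesses can be assigned to different line identifications at different redshifts.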

Prospects for confirmation of line features in GRBs are very good, considering that the 
X-ray spectral integration for GRB 011211 was begun 11 hours after the GRB was initially 
detected, and required 1.4 hrs of integration to obtain. Decreasing the reaction time would 
permit a longer integration, while the afterglow is brighter in the X-rays, and the marginal 
results found here may well be improved upon. 

We are grateful to J. Reeves, who generously made his observed spectrum of the 
first 5 ksec of the XMM-Newton observation of GRB 011211 available to us, that we 
might independently analyze it. We gratefully acknowledge useful conversations with A. 
MacFadyen, R. Blandford, and D. Fox. The authors are grateful to F. Harrison, F. Paerels 
and an anonymous referee for useful comments on the manuscript. MS was supported by 

Fig. 1. - (Top Panel): X-ray spectrum from XMM-Newton EPIC/pn of GRB 011211, with
energy binning optimized to find deviations, due to emission lines at the reported
energies, from the best-fit absorbed power-law model (solid line).
The data show no significant excess counts at the reported line energies. (Bottom Panel):
$\chi$ = (model-data)/$\sigma$, the residuals between the best-fit continuum spectrum and the data. The
5-10 keV energy bin, while included in our spectral fits, is not included in this figure, to
better show the 0.2-5 keV energy spectrum.

Fig. 2. - (Panel a): Solid line is C(E) (Eq. 2) from the observed raw PI spectrum, i.e. the
convolution between the raw spectrum and the EPIC/pn energy response. The broken lines
are the max(C(E)) for spectral models 1-6, showing the 99% and 99.9% confidence
single-trial upper-limits. (Panels b-f): The solid line is the same observed convolved spectrum
as in Panel a. Dotted lines are five (in the five separate panels) randomly selected Monte 
Carlo spectra using Model 1. Features of similar magnitude to those found in the observed 
spectra are apparent in each; these are due to the Poisson noise distribution (in energy) in 
a spectrum with a finite number of detected counts. 

Fig. 3. - (Top panel) Figure of merit x(z) using all five reported line energies (solid line),
for z < 3. We also show the extremum Monte-Carlo values for 99% (long dashed) and
99.9% confidence (short dashed), using all six model spectra. The solid vertical line marks
the redshift of the reported detection (z = 1.88); at this redshift, x(z = 1.88) is below the
99% confidence limit. (Bottom Panel) Same, except for only the Si XIV and S XVI lines.
Again, x(z = 1.88) is below the 99% confidence limit.

Fig. 4. - The background spectrum in the region near the CCD chip edge using two
different event selection criteria: (a) no explicit selection of PATTERN and FLAG, as adopted by
Borozdin & Trudolyubov (2002) (circles), binned at a minimum of 5 counts per bin, and (b)
using only PATTERN<=4 and FLAG==0 events, as adopted by R02 (stars), with bin sizes identical
to those of (a).

Table 1. Best-fit X-ray Spectral Parameters

Model          N_H                   Power-law slope   N                             $\chi^2_\nu$/dof
               (10^22 cm^-2)         or kT_bremss      (phot/keV/cm^2/s at 1 keV)    (prob^a)

Power Law
(1) Best Fit   0.10 (+0.04, -0.02)   2.6±0.2           (7.0±0.9)x10^-5               1.24/17 (0.22)
2              0.065±0.01            (2.3)             (5.8±0.5)x10^-5               1.35/18 (0.14)
3              0.16±0.02             (3.0)             (8.5±0.7)x10^-5               1.35/18 (0.15)

Thermal Bremsstrahlung
(4) Best Fit   0.03±0.01             1.5 [errors illegible]   (9.0±1.2)x10^-5        1.38/17 (0.13)
5              0.05±0.01             (1.1)             (12±1)x10^-5                  1.50/18 (0.08)
6              0.014±0.01            (2.1)             7.0 [errors illegible] x10^-5 1.44/18 (0.10)

Table 2. Fraction of Featureless MC Spectra which
produce single-energy-trial "lines" at 99% and 99.9%
confidence between 0.4 and 1.5 keV

Model    >99% (1 z bin)    >99.9% (1 z bin)
1        0.78              0.16
2        0.78              0.15
3        0.78              0.14
4        0.78              0.15
5        0.79              0.16
6        0.78              0.17


Table 3. Fraction of Featureless MC Spectra which
produce single-z-trial x(z) for 2 lines at 99% and
99.9% confidence between z = 1.88 and z = 2.40

Model    >99% (1 z bin)    >99.9% (7 z bins)
1        0.20              0.015
2        0.20              0.014
3        0.20              0.014
4        0.21              0.015
5        0.20              0.013
6        0.22              0.017


